Secure Runtime Environment (SRE)
A hardened, compliance-ready Kubernetes platform for deploying applications in regulated environments. One-click deploy, zero-trust security, full observability – all open source.
What You Get
A complete Kubernetes platform with 16 integrated components, all deployed and managed through GitOps:

| Category | Components | What It Does |
|---|---|---|
| Service Mesh | Istio | Encrypts all pod-to-pod traffic (mTLS), controls who can talk to whom |
| Policy Engine | Kyverno | Blocks insecure containers, enforces image signing, requires labels |
| Monitoring | Prometheus + Grafana + Alertmanager | Metrics, dashboards, and alerting for the entire cluster |
| Logging | Loki + Alloy | Centralized log collection and search from every pod |
| Tracing | Tempo | Distributed request tracing across services |
| Runtime Security | NeuVector | Detects and blocks anomalous container behavior in real time |
| Secrets | OpenBao + External Secrets Operator | Centralized secrets vault with automatic Kubernetes sync |
| Certificates | cert-manager | Automated TLS certificate issuance and rotation |
| Identity | Keycloak | Single sign-on (SSO) with OIDC/SAML for all platform UIs |
| Registry | Harbor + Trivy | Container image storage with vulnerability scanning on push |
| Backup | Velero | Scheduled cluster backup and disaster recovery |
| Load Balancer | MetalLB | Provides LoadBalancer IPs on bare metal (cloud uses native LB) |
| GitOps | Flux CD | Continuously reconciles cluster state from this Git repo |
Accessing the Platform
All platform UIs are exposed through a single Istio ingress gateway on standard HTTPS (port 443). No custom ports needed.
Step 1: Add DNS entries
Get the gateway IP and add DNS entries:
# Get the gateway's external IP (assigned by MetalLB on bare metal, or cloud LB on AWS/Azure)
GATEWAY_IP=$(kubectl get svc istio-gateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Add to /etc/hosts (or configure real DNS in production)
echo "$GATEWAY_IP portal.apps.sre.example.com dashboard.apps.sre.example.com grafana.apps.sre.example.com prometheus.apps.sre.example.com alertmanager.apps.sre.example.com harbor.apps.sre.example.com keycloak.apps.sre.example.com neuvector.apps.sre.example.com openbao.apps.sre.example.com oauth2.apps.sre.example.com" | sudo tee -a /etc/hosts
How it works: The Istio ingress gateway gets a dedicated IP via LoadBalancer (MetalLB on bare metal, cloud LB on AWS/Azure). When a request arrives on port 443, Istio reads the Host header and routes it to the correct backend service via VirtualService rules. All traffic is TLS-encrypted with a wildcard certificate for `*.apps.sre.example.com`.
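The TLS and wildcard-host side of this setup can be sketched as an Istio Gateway resource along the following lines (the resource name, pod selector label, and certificate secret name are illustrative assumptions, not taken from this repo):

```yaml
# Hypothetical sketch: one HTTPS listener terminating TLS for every
# *.apps.sre.example.com hostname at the ingress gateway.
apiVersion: networking.istio.io/v1
kind: Gateway
metadata:
  name: sre-gateway              # illustrative name
  namespace: istio-system
spec:
  selector:
    istio: gateway               # assumed label on the gateway pods
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE                        # terminate TLS at the gateway
        credentialName: wildcard-apps-tls   # assumed wildcard cert secret
      hosts:
        - "*.apps.sre.example.com"
```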
Step 2: Open any service
All URLs follow the pattern: https://<service>.apps.sre.example.com
| Service | URL | Default Credentials |
|---|---|---|
| Portal | https://portal.apps.sre.example.com | SSO via Keycloak |
| Dashboard | https://dashboard.apps.sre.example.com | SSO via Keycloak |
| Grafana | https://grafana.apps.sre.example.com | SSO via Keycloak (or admin / prom-operator) |
| Prometheus | https://prometheus.apps.sre.example.com | SSO via Keycloak |
| Alertmanager | https://alertmanager.apps.sre.example.com | SSO via Keycloak |
| Harbor | https://harbor.apps.sre.example.com | SSO via Keycloak (or admin / Harbor12345) |
| Keycloak | https://keycloak.apps.sre.example.com | admin / (auto-generated, see below) |
| NeuVector | https://neuvector.apps.sre.example.com | SSO via Keycloak (or admin / admin) |
| OpenBao | https://openbao.apps.sre.example.com | SSO via Keycloak |
SSO: All services (except Keycloak itself) are behind Single Sign-On. Clicking any link redirects you to Keycloak to log in once, then you're authenticated across all services.
Your browser will warn about the self-signed certificate – click through it or use `curl -k`.
Step 3: Get credentials
# Show all service URLs and credentials
./scripts/sre-access.sh
# Just credentials
./scripts/sre-access.sh creds
# Health check
./scripts/sre-access.sh status
How the Networking Works
```
            Internet / LAN
                  │
          ┌───────▼───────┐
          │  LoadBalancer │   Dedicated IP (MetalLB / cloud LB)
          │   :443  :80   │   Standard HTTPS/HTTP ports
          └───────┬───────┘
                  │
         ┌────────▼────────┐
         │  Istio Gateway  │   TLS termination
         │  (istio-system) │   Host-based routing
         └────────┬────────┘
                  │
   ┌──────────────┼──────────────┐
   │              │              │
┌──▼──────┐  ┌────▼────┐  ┌─────▼────┐
│ Grafana │  │ Harbor  │  │ Your App │
│  :3000  │  │  :8080  │  │  :8080   │
└─────────┘  └─────────┘  └──────────┘
```
Traffic flow for https://grafana.apps.sre.example.com:
1. DNS resolves to the gateway's LoadBalancer IP
2. HTTPS hits port 443 on that IP
3. Istio Gateway terminates TLS using the wildcard certificate
4. Istio reads the Host: grafana.apps.sre.example.com header
5. VirtualService rule matches and routes to kube-prometheus-stack-grafana.monitoring.svc:80
6. Grafana serves the response back through the same path
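Step 5 above corresponds to a VirtualService roughly like the following (a sketch – the resource and gateway names are assumptions, while the hostname and backend service come from the flow above):

```yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: grafana                  # illustrative name
  namespace: istio-system
spec:
  hosts:
    - grafana.apps.sre.example.com   # matched against the request's Host header
  gateways:
    - istio-system/sre-gateway       # assumed gateway reference
  http:
    - route:
        - destination:
            host: kube-prometheus-stack-grafana.monitoring.svc.cluster.local
            port:
              number: 80
```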
User Walkthrough
Here's exactly what it looks like when you use the platform, from first login to deploying an app.
1. SSO Gate – Every Service is Protected
Visit any URL and you're redirected to sign in. One login, access everywhere.

2. Keycloak Login
Enter your credentials (default: sre-admin / SreAdmin123!). Once signed in, you're authenticated across all services.

3. Portal – Your Starting Point
The portal is your home page. It shows all platform services with health status, quick actions, and direct links.

4. Dashboard – Platform Overview
Click "Dashboard" from the portal. See all 16 components, 3 nodes, and problem pods at a glance.

5. Services Tab – Direct Links to Everything
Browse all services with health indicators, descriptions, and one-click access.

6. Deploy Tab – One-Click App Deployment
Quick-start templates for instant demos, or use the custom form to deploy your own image.

7. Status Page – Shareable Health View
Operational status of every platform service. Share this URL with your team.

8. Audit Log – Cluster Events
Tabular view with type filters, namespace filter, pagination, and color-coded badges.

9. Credentials – Quick Access to Passwords
View service credentials without needing kubectl.

10. Command Palette (Ctrl+K)
Quick-search to jump to any page or external service.

11. Grafana – 30+ Dashboards
Cluster health, namespace resources, Istio traffic, Kyverno violations, and more.

12. Harbor – Container Registry
Image storage with Trivy vulnerability scanning on push.

13. Keycloak Admin – Identity Management
Manage users, groups, OIDC clients, and SSO configuration.

14. Mobile Responsive
Portal and dashboard adapt to mobile screens for on-the-go health checks.

Full user stories with walkthroughs: See docs/user-stories.md for detailed personas and step-by-step workflows for Platform Admins, Developers, Security Officers, Team Leads, New Hires, and Incident Responders.
Quick Start
Deploy to Any Existing Kubernetes Cluster
If you already have a Kubernetes cluster with kubectl access:
git clone https://github.com/morbidsteve/sre-platform.git
cd sre-platform
./scripts/sre-deploy.sh
The script handles storage provisioning, kernel modules, Flux CD bootstrap, and secret generation, then waits until all components are healthy (~10 minutes).
When it finishes:
./scripts/sre-access.sh # Show all URLs and credentials
Deploy from Scratch on Proxmox VE
Build a full cluster from bare metal:
git clone https://github.com/morbidsteve/sre-platform.git
cd sre-platform
./scripts/quickstart-proxmox.sh
See the Proxmox Getting Started Guide for details.
Deploy on Cloud (AWS, Azure, vSphere)
git clone https://github.com/morbidsteve/sre-platform.git
cd sre-platform
# 1. Provision infrastructure
task infra-plan ENV=dev
task infra-apply ENV=dev
# 2. Harden OS + install RKE2
cd infrastructure/ansible
ansible-playbook playbooks/site.yml -i inventory/dev/hosts.yml
# 3. Deploy the platform
cd ../..
./scripts/sre-deploy.sh
Deploy Your App
Option A: Web Dashboard (30 seconds)
- Open https://dashboard.apps.sre.example.com
- Click Deploy App
- Fill in: name, team, image, tag, port
- Click Deploy
The platform automatically adds security contexts, network policies, Istio mTLS, health probes, and Prometheus monitoring.
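As an illustration of the kind of hardening injected, the generated workload might carry security context fields like these (a sketch using standard Kubernetes securityContext fields, not the template's exact output):

```yaml
# Fragment of a Deployment pod spec the template might generate.
securityContext:                  # pod-level defaults
  runAsNonRoot: true
  runAsUser: 1000
  seccompProfile:
    type: RuntimeDefault
containers:
  - name: my-app                  # illustrative container name
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]             # required by the require-security-context policy
```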
Option B: CLI
# Create a team namespace (one-time)
./scripts/sre-new-tenant.sh my-team
# Deploy your app (interactive)
./scripts/sre-deploy-app.sh
# Push to Git → Flux handles the rest
git push
Option C: GitOps (manual YAML)
Create apps/tenants/my-team/my-app.yaml:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-app
  namespace: team-my-team
spec:
  interval: 10m
  chart:
    spec:
      chart: ./apps/templates/sre-web-app
      reconcileStrategy: Revision
      sourceRef:
        kind: GitRepository
        name: flux-system
        namespace: flux-system
  values:
    app:
      name: my-app
      team: my-team
    image:
      repository: nginx
      tag: "1.27-alpine"
    port: 8080
    ingress:
      enabled: true
      host: my-app.apps.sre.example.com
```
Commit and push → Flux deploys it automatically.
Container Requirements
Your container must:
- Run as non-root (UID 1000+)
- Listen on port 8080+ (not 80 or 443)
- Use a pinned version tag (not :latest)
Can't run as non-root? Use `nginxinc/nginx-unprivileged` instead of `nginx`, or add `USER 1000` to your Dockerfile.
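A minimal Dockerfile satisfying these requirements might look like this (the base image and port are illustrative):

```dockerfile
# Pinned tag (never :latest); this image runs as a non-root user
# and listens on 8080 by default.
FROM nginxinc/nginx-unprivileged:1.27-alpine

# For base images that default to root, switch to a non-root UID explicitly:
# USER 1000

EXPOSE 8080
```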
Architecture
SRE is composed of four layers:
```
┌──────────────────────────────────────────────────┐
│  Layer 4: Supply Chain Security                  │
│  Harbor + Trivy scanning + Cosign signing        │
│  + Kyverno image verification                    │
├──────────────────────────────────────────────────┤
│  Layer 3: Developer Experience                   │
│  Helm templates + Tenant namespaces              │
│  + SRE Dashboard + GitOps app deployment         │
├──────────────────────────────────────────────────┤
│  Layer 2: Platform Services (Flux CD)            │
│  Istio + Kyverno + Prometheus + Grafana + Loki   │
│  + NeuVector + OpenBao + cert-manager + Keycloak │
│  + Tempo + Velero + External Secrets             │
├──────────────────────────────────────────────────┤
│  Layer 1: Cluster Foundation                     │
│  RKE2 (FIPS + CIS + STIG) on Rocky Linux 9       │
│  Provisioned by OpenTofu + Ansible + Packer      │
└──────────────────────────────────────────────────┘
```
Layer 1 – Cluster Foundation: Infrastructure provisioned with OpenTofu (AWS, Azure, vSphere, Proxmox VE), OS hardened to DISA STIG via Ansible, and RKE2 installed with FIPS 140-2 cryptography and CIS benchmark hardening.
Layer 2 – Platform Services: All security, observability, and networking tools deployed via Flux CD. Every component is a HelmRelease in Git, continuously reconciled to the cluster.
Layer 3 – Developer Experience: Standardized Helm chart templates and self-service tenant namespaces. Developers deploy apps by committing a values file – the platform handles security contexts, network policies, monitoring, and mesh integration.
Layer 4 – Supply Chain Security: Images scanned by Trivy, signed with Cosign, verified by Kyverno at admission, and monitored at runtime by NeuVector.
Security Controls
Every request passes through multiple security layers:
```
Request → TLS Termination → JWT Validation → Authorization Policy → Network Policy → Istio mTLS → Application
                                                                                                      │
                                                                                         NeuVector Runtime Monitor
```
GitOps Flow
All changes flow through Git:
Developer → git push → GitHub → Flux CD detects change → Kyverno validates → Helm deploys → Pod running
No kubectl apply needed. No manual cluster access. Git is the single source of truth.
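In Flux terms, this loop is driven by a GitRepository source and a Kustomization that reconciles a path from it. A minimal sketch (resource names, intervals, branch, and path are assumptions based on the repo layout):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 1m                   # how often Flux polls the repo
  url: https://github.com/morbidsteve/sre-platform
  ref:
    branch: main                 # assumed branch
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: platform                 # illustrative name
  namespace: flux-system
spec:
  interval: 10m
  path: ./platform               # assumed path within the repo
  prune: true                    # remove resources deleted from Git
  sourceRef:
    kind: GitRepository
    name: flux-system
```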
Zero-Trust Security Model
Every layer enforces security independently – compromising one layer doesn't bypass the others:
| Layer | Control | What It Prevents |
|---|---|---|
| Gateway | Istio ext-authz + OAuth2 Proxy | Unauthenticated access to any service |
| Mesh | Istio mTLS STRICT | Unencrypted pod-to-pod communication |
| Network | NetworkPolicy default-deny | Lateral movement between namespaces |
| Admission | Kyverno 7 policies | Privileged containers, unsigned images, :latest tags |
| Runtime | NeuVector | Anomalous process execution, network exfiltration |
| Secrets | OpenBao + ESO | Hardcoded credentials, secret sprawl |
| Audit | Prometheus + Loki + Tempo | Unmonitored activity, missing forensic data |
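The Network row's default-deny baseline is a standard Kubernetes NetworkPolicy. A sketch of the usual shape (the policy name and namespace are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all         # illustrative name
  namespace: team-my-team        # applied per tenant namespace
spec:
  podSelector: {}                # empty selector = every pod in the namespace
  policyTypes:
    - Ingress
    - Egress                     # with no allow rules, all traffic is denied
```

Tenant workloads then get explicit allow rules layered on top of this baseline.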
Platform Components
Component Versions (as deployed)
| Component | Chart Version | Namespace |
|---|---|---|
| Istio (base + istiod + gateway) | 1.25.2 | istio-system |
| cert-manager | v1.14.4 | cert-manager |
| Kyverno | 3.3.7 | kyverno |
| kube-prometheus-stack | 72.6.2 | monitoring |
| Loki | 6.29.0 | logging |
| Alloy | 0.12.2 | logging |
| Tempo | 1.18.2 | tempo |
| OpenBao | 0.9.0 | openbao |
| External Secrets | 0.9.13 | external-secrets |
| NeuVector | 2.8.6 | neuvector |
| Velero | 11.3.2 | velero |
| Harbor | 1.16.3 | harbor |
| Keycloak | 24.8.1 | keycloak |
| MetalLB | 0.14.9 | metallb-system |
Kyverno Policies (7 active)
| Policy | Mode | What It Enforces |
|---|---|---|
| `disallow-latest-tag` | Enforce | Blocks `:latest` image tags |
| `require-labels` | Enforce | Requires `app.kubernetes.io/name` and `sre.io/team` labels |
| `require-network-policies` | Enforce | Ensures every namespace has a default-deny NetworkPolicy |
| `require-security-context` | Enforce | Requires non-root, drop ALL capabilities |
| `restrict-image-registries` | Enforce | Restricts images to approved registries |
| `require-istio-sidecar` | Audit | Requires Istio sidecar injection labels |
| `verify-image-signatures` | Audit | Verifies Cosign signatures on images |
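For illustration, a `disallow-latest-tag` style policy typically looks like the following Kyverno ClusterPolicy (a sketch following upstream Kyverno examples, not necessarily this repo's exact policy):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce   # reject non-compliant resources at admission
  background: true                   # also report on existing resources
  rules:
    - name: require-pinned-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must use a pinned tag, not :latest."
        pattern:
          spec:
            containers:
              - image: "!*:latest"   # negated wildcard: any image NOT ending in :latest
```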
Secrets Management
| Feature | Implementation |
|---|---|
| Secrets Vault | OpenBao (auto-initialized, auto-unsealed) |
| K8s Integration | External Secrets Operator syncs from OpenBao to K8s Secrets |
| Auth Method | Kubernetes ServiceAccount-based authentication |
| Engines | KV v2 (app secrets), PKI (certificates) |
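Syncing a value from OpenBao into a Kubernetes Secret is expressed as an ExternalSecret. A sketch of the shape (the store name, KV path, and key names are assumptions):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: my-app-db                # illustrative name
  namespace: team-my-team
spec:
  refreshInterval: 1h            # re-sync from the vault hourly
  secretStoreRef:
    kind: ClusterSecretStore
    name: openbao                # assumed store pointing at OpenBao
  target:
    name: my-app-db              # resulting Kubernetes Secret
  data:
    - secretKey: DB_PASSWORD     # key in the Kubernetes Secret
      remoteRef:
        key: apps/my-app         # assumed KV v2 path in OpenBao
        property: db_password
```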
SSO / Identity (Keycloak + OAuth2 Proxy)
All platform services are protected by SSO via Keycloak + OAuth2 Proxy + Istio ext-authz. A single login grants access to every service.
| Feature | Detail |
|---|---|
| Realm | sre with OIDC discovery |
| Groups | sre-admins, developers, sre-viewers |
| OIDC Clients | Grafana, Harbor, NeuVector, Dashboard, OAuth2 Proxy |
| SSO Gate | OAuth2 Proxy + Istio ext-authz on the gateway |
| Test User | sre-admin / SreAdmin123! (in sre-admins group) |
How SSO works:
1. User visits any service URL (e.g., https://grafana.apps.sre.example.com)
2. Istio gateway sends the request to OAuth2 Proxy for authentication check
3. If no valid session, OAuth2 Proxy redirects to Keycloak login page
4. User logs in once with Keycloak credentials
5. OAuth2 Proxy sets a session cookie valid across all *.apps.sre.example.com services
6. All subsequent requests pass through automatically – no more logins needed
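The ext-authz gate in step 2 is typically wired with an Istio AuthorizationPolicy using the CUSTOM action. A sketch (the policy name, gateway selector label, and provider name are assumptions; the provider must match an `extensionProviders` entry in the mesh config):

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: sso-gate                 # illustrative name
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: gateway             # assumed label on the gateway pods
  action: CUSTOM                 # delegate the decision to an external authorizer
  provider:
    name: oauth2-proxy           # assumed extensionProvider name in mesh config
  rules:
    - to:
        - operation:
            hosts:
              - "*.apps.sre.example.com"
```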
Observability
| Feature | Detail |
|---|---|
| Grafana Dashboards | 5 custom SRE dashboards (cluster, namespace, istio, kyverno, flux) + 31 built-in |
| PrometheusRules | 22 alerts across 8 groups (certs, flux, kyverno, nodes, storage, pods, security, istio) |
| Alertmanager | Severity-based routing (critical/warning/info) with inhibition rules |
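Alerts are declared as PrometheusRule resources picked up by the operator. A sketch of one certificate-expiry rule (the resource name, threshold, and labels are illustrative; the metric is the standard cert-manager expiration gauge):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: sre-cert-alerts          # illustrative name
  namespace: monitoring
spec:
  groups:
    - name: certs
      rules:
        - alert: CertificateExpiringSoon
          # Fires when a certificate expires within 7 days
          expr: certmanager_certificate_expiration_timestamp_seconds - time() < 7 * 24 * 3600
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "Certificate {{ $labels.name }} expires within 7 days"
```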
CI/CD Pipeline
Reusable GitHub Actions workflows in ci/github-actions/:
- Build container image with Docker Buildx
- Scan with Trivy (fail on CRITICAL)
- Generate SBOM with Syft (SPDX + CycloneDX)
- Sign with Cosign
- Push to Harbor
- Update GitOps repo (Flux auto-deploys)
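A condensed sketch of such a workflow (action versions, registry URL, and image path are assumptions, and the SBOM and GitOps-update steps are omitted for brevity):

```yaml
name: build-scan-sign
on: [push]

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build and push image
        uses: docker/build-push-action@v6
        with:
          push: true
          tags: harbor.apps.sre.example.com/team/my-app:${{ github.sha }}

      - name: Scan with Trivy (fail on CRITICAL)
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: harbor.apps.sre.example.com/team/my-app:${{ github.sha }}
          severity: CRITICAL
          exit-code: "1"          # non-zero exit fails the job on findings

      - name: Install Cosign
        uses: sigstore/cosign-installer@v3
      - name: Sign image
        run: cosign sign --yes harbor.apps.sre.example.com/team/my-app:${{ github.sha }}
```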
Compliance Artifacts
| Artifact | Path |
|---|---|
| OSCAL System Security Plan | compliance/oscal/ssp.json |
| NIST 800-53 Control Mapping | compliance/nist-800-53-mappings/control-mapping.json |
| CMMC 2.0 Level 2 Assessment | compliance/cmmc/level2-assessment.json |
| RKE2 DISA STIG Checklist | compliance/stig-checklists/rke2-stig.json |
Project Structure
```
sre-platform/
├── platform/                  # Flux CD GitOps manifests
│   ├── flux-system/           # Flux bootstrap
│   ├── core/                  # Core platform components
│   │   ├── istio/             # Service mesh (mTLS, gateway, auth)
│   │   ├── cert-manager/      # TLS certificates
│   │   ├── kyverno/           # Policy engine
│   │   ├── monitoring/        # Prometheus + Grafana + Alertmanager
│   │   ├── logging/           # Loki + Alloy
│   │   ├── tracing/           # Tempo
│   │   ├── openbao/           # Secrets vault
│   │   ├── external-secrets/  # Secrets sync to K8s
│   │   ├── runtime-security/  # NeuVector
│   │   └── backup/            # Velero
│   └── addons/                # Optional components
│       ├── harbor/            # Container registry
│       └── keycloak/          # Identity / SSO
├── apps/
│   ├── portal/                # SRE Portal – tiled landing page for all services
│   ├── dashboard/             # SRE Dashboard web app (v2.0.2)
│   ├── demo-app/              # Go demo app with Prometheus metrics
│   ├── templates/             # Helm chart templates (web-app, worker, cronjob, api)
│   └── tenants/               # Per-team app configs (team-alpha, team-beta)
├── ci/
│   └── github-actions/        # Reusable CI/CD workflows (build, scan, sign, deploy)
├── policies/                  # Kyverno policies + test suites
├── infrastructure/
│   ├── tofu/                  # OpenTofu modules (AWS, Azure, vSphere, Proxmox)
│   ├── ansible/               # OS hardening + RKE2 install
│   └── packer/                # Immutable VM image builds
├── compliance/                # OSCAL, STIG checklists, NIST mappings
├── scripts/                   # Deploy, access, and management scripts
└── docs/                      # Full documentation
```
Compliance
SRE targets these government and industry compliance frameworks:
| Framework | Coverage |
|---|---|
| NIST 800-53 Rev 5 | AC, AU, CA, CM, IA, IR, MP, RA, SA, SC, SI control families |
| CMMC 2.0 Level 2 | All 110 NIST 800-171 controls |
| DISA STIGs | RKE2 Kubernetes, RHEL 9 / Rocky Linux 9, Istio |
| FedRAMP | NIST 800-53 control inheritance + OSCAL artifacts |
| CIS Benchmarks | Kubernetes (via RKE2), Rocky Linux 9 Level 2 |
Every Kyverno policy, Helm chart, and Flux manifest includes `sre.io/nist-controls` annotations mapping to specific NIST 800-53 controls.
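Such an annotation is just resource metadata; for example (the control IDs shown are illustrative, not a mapping from this repo):

```yaml
# Fragment of any annotated manifest
metadata:
  annotations:
    sre.io/nist-controls: "AC-3, SC-7, SC-8"   # illustrative NIST 800-53 control IDs
```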
Scripts Reference
| Script | Description |
|---|---|
| `scripts/sre-deploy.sh` | One-button platform install on any K8s cluster |
| `scripts/sre-access.sh` | Show all service URLs, credentials, and health status |
| `scripts/sre-access.sh status` | Quick health check (all HelmReleases + problem pods) |
| `scripts/sre-access.sh creds` | Show credentials for all platform services |
| `scripts/sre-new-tenant.sh <team>` | Create a team namespace with RBAC, quotas, network policies |
| `scripts/sre-deploy-app.sh` | Interactive app deployment (generates HelmRelease) |
| `apps/dashboard/build-and-deploy.sh` | Build and deploy the SRE Dashboard to the cluster |
Documentation
| Guide | Description |
|---|---|
| Architecture | Full platform spec and design rationale |
| User Stories | Personas, walkthroughs, and screenshots for every user type |
| Decision Records | ADRs for all major technology choices |
| Developer Guide | Deploy your app, secrets management, SSO, CI/CD |
| Proxmox Guide | Build a cluster from scratch on Proxmox VE |
| Session Playbook | Step-by-step build plan |
| CI/CD Pipelines | Reusable GitHub Actions for build/scan/sign/deploy |
| Istio AuthZ Policies | Zero-trust network policies |
Contributing
Branch naming: feat/, fix/, docs/, refactor/ prefixes
Commit format: Conventional Commits – e.g. `feat(istio): add strict mTLS peer authentication`
Requirements:
- task lint and task validate must pass
- Every component needs a README.md
- All Kyverno policies need test suites
- All Helm charts need values.schema.json
- Never use `:latest` tags – pin specific versions
- Never commit secrets or credentials
License
Apache License, Version 2.0. See LICENSE.